Enhanced Information Access to Social Streams Through Word Clouds with Entity Grouping

Authors

  • Martin Leginus
  • Leon Derczynski
  • Peter Dolog
Abstract

Intuitive and effective access to large volumes of information is increasingly important. As social media grows rapidly as a source of information, methods are needed to access these large volumes of user-generated content. Word clouds are an effective information access tool. However, those generated over social media data often depict redundant and mis-ranked entries. This limits users' ability to browse and explore datasets. This paper proposes a method for improving word cloud generation over social streams. Named entity expressions in tweets are detected, disambiguated and aggregated into entity clusters. A word cloud is generated from terms that represent the most relevant entity clusters. We find that word clouds with grouped named entities attain significantly broader coverage and significantly decreased content duplication. Further, access to relevant entries in the collection is improved. An extrinsic crowdsourced user evaluation of generated word clouds was performed. Word clouds with grouped named entities are rated as significantly more relevant and more diverse than the baseline. In addition, we found that word clouds with higher levels of Mean Average Precision (MAP) are more likely to be rated by users as being relevant to the concepts reflected. Critically, this supports MAP as a tool for predicting word cloud quality without requiring a human in the loop.
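
The grouping step described in the abstract can be illustrated with a minimal Python sketch. It assumes that entity detection and disambiguation have already been done, so each mention arrives as a (surface form, entity ID) pair. The function name `grouped_cloud_terms`, the entity IDs, the sample mentions, and the choice of the most frequent surface form as the cluster label are all illustrative assumptions, not the authors' implementation.

```python
from collections import Counter, defaultdict


def grouped_cloud_terms(mentions, top_k=3):
    """Aggregate disambiguated mentions into entity clusters.

    mentions: iterable of (surface_form, entity_id) pairs (hypothetical input
    format; the linking step itself is assumed to have run beforehand).
    Returns up to top_k (label, weight) pairs, heaviest cluster first.
    """
    # Count how often each surface form occurs per entity.
    surface_counts = defaultdict(Counter)
    for surface, entity_id in mentions:
        surface_counts[entity_id][surface] += 1

    clusters = []
    for counts in surface_counts.values():
        weight = sum(counts.values())        # all mentions of the entity
        label = counts.most_common(1)[0][0]  # assumed labelling rule
        clusters.append((label, weight))

    # Rank clusters by weight, breaking ties alphabetically by label.
    clusters.sort(key=lambda c: (-c[1], c[0]))
    return clusters[:top_k]


# Hypothetical mentions: several surface forms map to the same entity.
mentions = [
    ("Obama", "E1"), ("Barack Obama", "E1"),
    ("Obama", "E1"), ("President Obama", "E1"),
    ("NYC", "E2"), ("New York", "E2"), ("NYC", "E2"),
    ("Copenhagen", "E3"),
]
print(grouped_cloud_terms(mentions, top_k=2))
```

Without grouping, "Obama", "Barack Obama" and "President Obama" would compete for separate slots in the cloud. With grouping they share one slot, and its weight reflects all four mentions, which is the intuition behind the reduced duplication reported in the abstract.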

Similar resources

Towards a Visually Enhanced Medical Search Engine

This paper presents the prototype of an information retrieval system for medical records that utilises visualisation techniques, namely word clouds and timelines. The system simplifies and assists information seeking tasks within the medical domain. Access to patient medical information can be time consuming as it requires practitioners to review a large number of electronic medical records to ...

A Study of the Application of Geographic Information Systems (GIS) in Children Access to Pharmacies: A Case Study of Kermanshah, West of Iran

Background: Adequate access to health services has tremendous effects on the usefulness and efficiency of health care. Therefore, this study aimed to investigate the access of girls under the age of 14 years old to pharmacies in Kermanshah, Iran. Materials and Methods: In this cross-sectional study, the access of ... Results: In terms of access to 25 pharmacies through walking, the findings revealed ...

Beyond Monetization: Creating Value through Online Social Networks

Social networking sites are typically monetized via side payments, buy-clubs and affiliate programs, access controls, aggregation of content and integrated mobile platforms. However, there are other often untapped ways to create value through social networks, especially those related to enhanced interaction with customers, and knowledge creation and dissemination within organizations. In this a...

The Effect of Using Word Clouds on EFL Students’ Long-Term Vocabulary Retention

Vocabulary is an important component in all four skills of language. The issue of vocabulary retention is of great importance to EFL teachers in instructional contexts because they always ...

Word Clouds with Latent Variable Analysis for Visual Comparison of Documents

Word cloud is a visualization form for text that is recognized for its aesthetic, social, and analytical values. Here, we are concerned with deepening its analytical value for visual comparison of documents. To aid comparative analysis of two or more documents, users need to be able to perceive similarities and differences among documents through their word clouds. However, as we are dealing wi...

Publication date: 2015